Synonymous substitutions substantially improve evolutionary inference from highly diverged proteins.

Authors

  • Tae-Kun Seo
  • Hirohisa Kishino
Abstract

Codon- and amino acid-substitution models are widely used for the evolutionary analysis of protein-coding DNA sequences. Using codon models, the amounts of both nonsynonymous and synonymous DNA substitutions can be estimated. The ratio of these amounts represents the strength of selective pressure. Using amino acid models, the amount of nonsynonymous substitutions is estimated, but that of synonymous substitutions is ignored. Although amino acid models lose all information regarding synonymous substitutions, they explicitly incorporate information on amino acid replacement that is empirically derived from databases. It is often presumed that when protein-coding sequences are highly divergent, synonymous substitutions might be saturated and the evolutionary analysis may be hampered by synonymous noise. However, there exists no quantitative procedure to verify whether synonymous substitutions can be ignored; therefore, amino acid models have been arbitrarily selected. In this study, we investigate the statistical comparison of codon- and amino acid-substitution models. For this purpose, we propose a new procedure to transform a 20-dimensional amino acid model into a 61-dimensional codon model. This transformation reveals that amino acid models belong to a subset of the codon models and enables us to test whether synonymous substitutions can be ignored by using the likelihood ratio. Our theoretical results and analyses of real data indicate that synonymous substitutions are very informative and substantially improve evolutionary inference, even when the sequences are highly divergent. Therefore, we note that amino acid models should be adopted only after carefully investigating and discarding the possibility that synonymous substitutions can reveal important evolutionary information.
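
To make the two state spaces in the abstract concrete, the following minimal Python sketch enumerates the 61 sense codons and the 20 amino acids they encode, and counts how many single-nucleotide changes between sense codons are synonymous versus nonsynonymous. It assumes the standard genetic code, which the abstract does not specify, and it is not the authors' transformation procedure; the names `CODON_TABLE`, `SENSE_CODONS`, and `classify_single_changes` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Assumption: the standard genetic code, written in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to its amino acid ("*" marks a stop codon).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# The codon state space: sense codons only (stop codons are excluded).
SENSE_CODONS = [c for c, aa in CODON_TABLE.items() if aa != "*"]

# The amino acid state space: the distinct residues those codons encode.
AMINO_ACIDS = sorted({CODON_TABLE[c] for c in SENSE_CODONS})


def classify_single_changes():
    """Count ordered single-nucleotide changes between sense codons.

    Returns (synonymous, nonsynonymous). Changes that produce a stop
    codon are skipped, since they leave the sense-codon state space.
    """
    synonymous = 0
    nonsynonymous = 0
    for codon in SENSE_CODONS:
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                mutant_aa = CODON_TABLE[mutant]
                if mutant_aa == "*":
                    continue
                if mutant_aa == CODON_TABLE[codon]:
                    synonymous += 1
                else:
                    nonsynonymous += 1
    return synonymous, nonsynonymous


if __name__ == "__main__":
    syn, nonsyn = classify_single_changes()
    print(len(SENSE_CODONS), "sense codons;", len(AMINO_ACIDS), "amino acids")
    print(syn, "synonymous and", nonsyn, "nonsynonymous single-nucleotide changes")
```

The synonymous changes counted here are exactly the transitions that an amino acid model cannot represent, because both codons collapse to the same one of its 20 states.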

Related articles

Molecular evolution of dinoflagellate luciferases, enzymes with three catalytic domains in a single polypeptide.

Enzymes with multiple catalytic sites are rare, and their evolutionary significance remains to be established. This study of luciferases from seven dinoflagellate species examines the previously undescribed evolution of such proteins. All these enzymes have the same unique structure: three homologous domains, each with catalytic activity, preceded by an N-terminal region of unknown function. Bo...

Inference of genome duplications from age distributions revisited.

Whole-genome duplications (WGDs), thought to facilitate evolutionary innovations and adaptations, have been uncovered in many phylogenetic lineages. WGDs are frequently inferred from duplicate age distributions, where they manifest themselves as peaks against a small-scale duplication background. However, the interpretation of duplicate age distributions is complicated by the use of K(S), the n...

Estimating absolute rates of synonymous and nonsynonymous nucleotide substitution in order to characterize natural selection and date species divergences.

The rate of molecular evolution can vary among lineages. Sources of this variation have differential effects on synonymous and nonsynonymous substitution rates. Changes in effective population size or patterns of natural selection will mainly alter nonsynonymous substitution rates. Changes in generation length or mutation rates are likely to have an impact on both synonymous and nonsynonymous s...

Calibration of multiple poliovirus molecular clocks covering an extended evolutionary range.

We have calibrated five different molecular clocks for circulating poliovirus based upon the rates of fixation of total substitutions (K(t)), synonymous substitutions (K(s)), synonymous transitions (A(s)), synonymous transversions (B(s)), and nonsynonymous substitutions (K(a)) into the P1/capsid region (2,643 nucleotides). Rates were determined over a 10-year period by analysis of sequences of ...

Molecular evolution of a type 1 wild-vaccine poliovirus recombinant during widespread circulation in China.

Type 1 wild-vaccine recombinant polioviruses were isolated from poliomyelitis patients in China from 1991 to 1993. We compared the sequences of 34 recombinant isolates over the 1,353-nucleotide (nt) genomic interval (nt 2480 to 3832) encoding the major capsid protein, VP1, and the protease, 2A. All recombinants had a 367-nt block of sequence (nt 3271 to 3637) derived from the Sabin 1 oral polio...

Journal:
  • Systematic Biology

Volume 57, Issue 3

Pages: -

Publication date: 2008